Revisiting the Polyak step size
This paper revisits the Polyak step size schedule for convex optimization
problems, proving that a simple variant of it simultaneously attains
near-optimal convergence rates for the gradient descent algorithm, for all
ranges of strong convexity, smoothness, and Lipschitz parameters, without
a priori knowledge of these parameters.
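For context, the sketch below shows gradient descent with the classical Polyak step size, which sets the step to (f(x) - f*) / ||grad f(x)||^2 and therefore needs the optimal value f* up front; the paper's variant is designed precisely to avoid such a priori knowledge. The function names and the quadratic test problem are illustrative, not from the paper.

```python
import numpy as np

def polyak_gd(f, grad, x0, f_star, n_steps=100):
    """Gradient descent with the classical Polyak step size.

    Each step uses eta = (f(x) - f_star) / ||grad f(x)||^2, which
    assumes the optimal value f_star is known in advance (the
    assumption the paper's variant removes).
    """
    x = x0.copy()
    for _ in range(n_steps):
        g = grad(x)
        gnorm2 = g @ g
        if gnorm2 == 0:  # stationary point reached
            break
        eta = (f(x) - f_star) / gnorm2  # Polyak step size
        x = x - eta * g
    return x

# Illustrative use: minimize the convex quadratic f(x) = ||x||^2 / 2, f* = 0.
f = lambda x: 0.5 * (x @ x)
grad = lambda x: x
x_opt = polyak_gd(f, grad, x0=np.ones(5), f_star=0.0)
```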
A Spectral Algorithm for Learning Hidden Markov Models
Hidden Markov Models (HMMs) are one of the most fundamental and widely used
statistical tools for modeling discrete time series. In general, learning HMMs
from data is computationally hard (under cryptographic assumptions), and
practitioners typically resort to search heuristics which suffer from the usual
local optima issues. We prove that under a natural separation condition (bounds
on the smallest singular value of the HMM parameters), there is an efficient
and provably correct algorithm for learning HMMs. The sample complexity of the
algorithm does not explicitly depend on the number of distinct (discrete)
observations---it implicitly depends on this quantity through spectral
properties of the underlying HMM. This makes the algorithm particularly
applicable to settings with a large number of observations, such as those in
natural language processing, where the space of observations is sometimes the
set of words in a language. The algorithm is also simple, employing only a singular
value decomposition and matrix multiplications.
Comment: Published in JCSS Special Issue "Learning Theory 2009".
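The core construction can be written in a few lines of linear algebra. The following is a condensed sketch of the observable-operator step under the paper's separation condition, assuming the empirical unigram, bigram, and trigram statistics (named P1, P21, and P31 here for illustration) have already been estimated from the observed sequences; names and shapes are assumptions of this sketch, not the paper's notation.

```python
import numpy as np

def spectral_hmm(P1, P21, P31, m):
    """Sketch of the spectral HMM learning step.

    P1  : (n,)      empirical unigram probabilities  Pr[x1]
    P21 : (n, n)    empirical bigram matrix          Pr[x2, x1]
    P31 : (n, n, n) empirical trigram slices, P31[x] = Pr[x3, x2 = x, x1]
    m   : number of hidden states

    Returns parameters (b1, binf, B) such that the joint probability of a
    sequence is approximated by binf @ B[x_t] @ ... @ B[x_1] @ b1.
    """
    # SVD of the bigram matrix; keep the top-m left singular vectors.
    U, _, _ = np.linalg.svd(P21)
    U = U[:, :m]

    pinv = np.linalg.pinv(U.T @ P21)          # (U^T P21)^+
    b1 = U.T @ P1
    binf = np.linalg.pinv(P21.T @ U) @ P1
    # One observable operator per discrete observation symbol x.
    B = [U.T @ P31[x] @ pinv for x in range(P31.shape[0])]
    return b1, binf, B
```

Note how the sample complexity enters only through the singular values of P21: the pseudoinverses above are well conditioned exactly when the smallest relevant singular value is bounded away from zero, matching the separation condition in the abstract.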